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A master archer hits 

A target at a hundred yards because 

He skill possesses but, to make to meet 
Two arrows in mid-air, head-on, goes far 


Beyond the skill of ordinary man. 
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Abstract 
A collection of FORTRAN routines has been developed for use 
in minimum variance modelling and control. The routines can 
be grouped into five categories : time series modelling, 
SISO transfer function plus noise modelling, MIMO transfer 
function plus noise modelling, unconstrained minimum 
variance controller design and constrained minimum variance 
controller design. 

Within a category, routines can be grouped by the step 
at which they are used. For example, transfer function plus 
noise models require five steps : differencing, 
prewhitening, model order identification, preliminary 
estimation and least squares estimation. The main line 
routine used to perform each step calls on a number of 
Subroutines. After each step has been completed, the analyst 
must interpret the graphical and numeric outputs produced. 
The decision then is whether to continue to the next step, 
or change some input and rerun the current step. That is, 
the modelling process is interactive and iterative in 
natures 

Fourteen main line routines are required to run the 
individual steps in the five categories of tasks which can 
be performed. These main line routines rely on seventy-eight 
subroutines to perform the work. Altogether, this amounts to 
about ten thousand lines of code. 

The only way to reliably generate this amount of code 


is to have some standard practices, and a rich test 
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environment. The test environment was provided by a 
"neatness" program called *FTNTIDY, along with a very good 
checking compiler and run-time system, *IF. Some of the 
Standard practices were: 

1. one function per subroutine 

2. topedown tcoding 

3. chunk implementation and checkout 

4, consistent subroutine call argument format 

5. explicit declaration of all variables 

6. levels of nesting indicated by indentation 

7. errors handled within the routine where they occur 

As currently implemented, this set of subroutines is 
dependent on two features of MTS - the IMSL library, and the 
DLOuUE mn Carb rariy,. 

Of course, there are other time series packages. The 
package developed by the author differs from the MTS 
resident "Time Series Processor" by its extensive use of 
graphics to represent results. It also has the capability 
for transfer function plus noise modelling, and can derive 
minimum variance controllers from transfer function plus 
noise models. These additions make the package much more 
useful in control applications. 

The purpose of this thesis is to develop the background 
necessary for the use of this subroutine library. This will 
serve aS an introduction to simple stochastic modelling, and 
minimum variance control, as well as the foundations of 


adaptive control. 
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1. INTRODUCTION 
In practice, estimating the possible return on a computer 
control application is difficult. Applications are selected 
on the basis of possible benefits from a change in the mean 
value of some key variable(s). The aim is to reduce the 
amount of off-specification product, increase throughput, 
decrease energy consumption, etc. 

But substantial gains can also result from just 
reducing the variance of a key variable. This can occur when 
profit is related to the key variable by a nonlinear 
function. 

For example, the ratio of hydrogen to nitrogen (H/N 
ratio) at the synthesis converter inlet is a key variable in 
ammonia plants. Figure 1.1 presents a typical plant 
performance function, as tons per day of production versus 
H/N ratio, for a given set of operating conditions. This 


function can be approximated as: 


y = 997.5 - 937.5(u-2.9)? (1.1) 
where: 

y = tons/day production 

Ue=8H/N catio 


Assume! that, without computer control, the H/N ratio is 
2.9+0.3. Also assume that a combination of gas chromatograph 
and hydrogen analyzer could be used in a computer control 
scheme to produce a ratio of 2.9+0.05. 

If the distribution of H/N ratio is uniform (all values 
equally likely), then computer control can change the 


average production from 960.0 tons/day, to 996.5 tons/day. 


sia fits ut. Peony 
edz sovher Of Be de a. re B 
sugdpue tis sessioni | #autoag Aa awat? insee-B3c to 
33 40d! saquenee yorste sonora 


s ha 


devt modi) dfoeer o2@fe feo aciee. taj #fasedue tus a 
nidw ween wap ela? ,.sidscis sant i io acfeytisy att sie 


seenttnon s ed aidetasy coal aaa et Bateies 2 +htomg, © 4 7 
.setsoniih 


tH) Kupaeet o7 Mepozs vii 16 ‘et! a i: igqgnaxs 20% ae 
ae 


4 : « 


At aidetvev VeW « ek t4fni vosisvnov “Béearstye end 38 (olget 
tneig Isotqys « ecmeasia |. sap? .adietq atasiemg: 

eveiev noticubeag) 1, vib. bag .2do2-en (roksan Sams 
sie .ancesitaen otk shee Qo. Jae’ navte 6 tot gical 

£26 beadutscigas at 95> rotsanua - 

(at) a; WevesEye cree -pa Ege, @ &i | .— 

: nol tounetg YSH\enoy9 = ¥ 

; SN yt) Ne: A otek mS au 

at cite: WAH odd ,fotenos re sucmos jundgiw ,jedd omubeA 


Mosesasaati ene tb Wiseiidnes 5 tear suunes ona 808 | 
ant beau oi Bivod Tkeylens nepotbyd Sam 
7 Dian 22 tan ieiri 03 ses 


- 


Sens = owe a leet ay 


O00°SOOT O00°000T 00°S66 ek 


ear 
ee 
———— inh 
ae 
ae 
ae ts 
4 
ie 
Die ck 
ee 
ee aaa Sen ae ae ae ee er 
66 OG° Sa 00°08 00°SLE5 00° OLE CO*S96 


AVG/SNOL) NOILONGOYd SHN 


-20 


2.96 


H/N RATIO 


2-88 


00'096~ 


.1 Ammonia Plant Performance Function 
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Figure 


If the distribution is normal (i.e. Gaussian), the change is 
from 992.8 tons/day to 996.9. tons/day. 

That is, without changing the mean value of the key 
variable, production can be increased. 

Ingother sutuatiaons, (eng.1 fractionatore control), the 
reduced variance permits operation closer to a constraint. 
Here, the prime benefit is from a change in the mean value, 
once the variation is reduced to permit the change. 

The objective of this thesis is to present, in a clear, 
concise manner, a set of procedures for achieving minimum 
variation. The intention is to be reasonably mathematically 
rigorous. However, there are some basic concepts left 
undefined (e.g. expected value vs. wide sense expectation). 
Also, as this thesis is intended to present only one method, 
concepts not central to the development have been skipped, 
or merely hinted at (e.g. state estimators). It is hoped 
that the result is a clearly defined introduction to the 
broad field of system identification and parameter 
estimation. 

As part of this-introduction to the field, an extensive 
set of tools will be presented. These tools have been 
implemented as a library of FORTRAN subroutines They deal 
with progressively more complex situations in modelling. A 
faciddtyiftorrderiving.variousstypes,ofecontrolderefrom, these 
models has also been included. The chief advantage of this 
set of tools is the rich graphical output which is 


indispensable in this application. Consistency of 
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implementation and documentation are also important factors 
recommending the library's use. 

However, as well as these tools for offline analysis, 
tools for online, open-loop experimentation are required. 
These are not included in the library, since their 
implementation is highly situation dependent. 

In the following chapters, the background necessary for 
the use of the library will be built up stepwise. Chapter 2 
sets out the general notions of model building, and examines 
various techniques briefly. The correlation method for 
building input-output models of stochastic plants is then 
outlined. The foundation of the correlation method is 
examined more closely in Chapter 3. This chapter sets out 
three fundamental tools - autocorrelation, partial 
autocorrelation, and power spectrum. This is the basic 
toolkit for both time series and transfer function 
modelling. (Time series modelling is not central to this 
thesis, but the technique is of great significance. The 
topic is detailed in Appendix A.) Transfer function plus 
noise modelling (Chapter 4) follows naturally from Chapter 
3. Here, definition of the modelling procedure is completed, 
and illustrated with an example. 

Finally, Chapter 5 addresses the problem of deriving a 
control law from the plant model. The chapter ends with an 
illustration of control of the plant modelled in Chapter 4. 
Chapter 6 presents a series of conclusions and observations 


on the method. 
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2. MODELLING STOCHASTIC PLANTS 
In this chapter, we will introduce some basic concepts and 
terminology. First, we present a few categories into which 
models may be set. Our purpose is to indicate the nature of 
the models and techniques to be developed. Next, two 
important methods for describing systems - via state space 
and transfer function models - are considered briefly. These 
first sections serve to indicate the general characteristics 
of the models to be used, and to specify their form. The 
remaining sections outline procedures which can be used in 


the building of these models. 


2.1 Categories of Models 

There are two basic approaches to modelling the dynamic 
behavior of a system. The first approach could be called 
theoretical model building. It requires the following steps: 

1. Apply basic conservation laws to the process being 
studied. 

2. Derive the ordinary or partial differential/difference 
equations describing those aspects of the system's 
behavior which are of interest. 

3. Estimate any unknown model parameters. 

4, Verify and test the model. 

The second approach is experimental determination of the 

system model, on the basis of input-output data. In this 

case we obtain a representation only of the observable and 


controllable part of the system. 
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For an existing plant, the best "model" is the plant 
itself. That is, experimental methods will involve the whole 
plant (valve positioners, piping, distributor plates, etc.) 
Theoretical models can never be fully descriptive, but are 
indispensable for predicting either the behavior of new 
plants, or the results of extreme changes in operating 
conditions. 

This work will concentrate on experimental models. 
Further, the models considered will be written in discrete 
time, rathersthanecontinucus time,*to-allow for: 

1. Slow measurements (e.g. laboratory analyses) 

2. time-shared instruments (e.g. gas chromatographs) 

3. use of digital computers in data acquisition and control 
Models can also be differentiated on the basis of whether or 
not they attempt to deal with "noise". In the deterministic 
case, either there is no noise, or it is negligible. Strejc 
(1981) presents a summary of the most common deterministic 
methods. With such methods, evaluation of plant measurements 
yields exact values for model parameters. 

Techniques for developing deterministic models fail at 
low signal to noise ratios. In this region, system response 
to a test input cannot easily be distinguished from the 
system response to the noise that is normally present. 

Stochastic models are based on statistical methods, to 
account for the effects of noise. This noise may be assumed 
to act at various places in the system (input, measurement, 


etc.) The noise itself is usually unmeasureable. Evaluations 
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of plant measurements yield only estimates of the parameters 


in the plant model. To apply such methods, we should have 


some notion as to what characterizes a "good" estimator. 


Some definitions of desirable properties follow: 


D. 


Unbiasedness: Let @ be an estimate of a parameter vector 
6, 1E-B{6}-= 6, then @ iS an unbiased estimate of 6. (E 
denotes expected value.) 

Consistency: The estimate g converges in probability to 
6 as the number of samples, N, increases. 


lim {Pr(|@ - @|) = 0} = 1 


N+ 

Consistency implies unbiasedness, but not vice versa. 
Efficiency: 6 is an efficient estimate if no other 
unbiased estimator has smaller variance than 

E{ (6-6) (6-6)}. 

SULLICIenGy: BASSULELCIEnteestimator contains all@the 
information in the observations regarding the parameters 
to be estimated. A sufficient estimate 6 of @, based on 
the sample vector Y, is such that the conditional 
expectation B{6|yv} is independent of @. 


Estimators have been characterized by Strejc (1981) 


AGGOrding towthe error cost function and probabilistic 


concept as follows: 


Bayes estimation (Ho and Lee,(1964); Peterka(1978) ) 
Maximum likelihood (Deutsch,(1965); Astrom, (1980) ) 
Markov estimate (Deutsch, (1965) ) 

Stochastic approximation (Kiefer and Wolfowitz, (1952) ) 


Least squares (Strejc,(1980)) 
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6. Weighted least squares (Deutsch, (1965)) 

7. Generalized least squares (Eykhoff,(1974); 
Hastings-James and Sage, (1969) ) 

8. Extended least squares (Young and Hastings-James, (1970) ) 

9. Square root filtering (Peterka, (1975)) 

10. Kalman-Bucy filtering (Kalman and Bucy, (1967)) 

11. Instrumental variables (Kendall and Stuart,(1961); Young 
and Jakeman, (1979) ) 

That 1S, many eStimators are available, all with varying 

strengths. Each has its proper applications. For example, 

least Squares cannot be used in closed loop. Instrumental 

variables prove their worth in such an application. 

Finally, models can be characterized as parametric or 
nonparametric. The essential difference is that parametric 
models can be economic in their use of parameters. They 
assume some knowledge of the plant's structure. 
Nonparametric models are prodigal in their use of 
Parameters. When using parametric methods, if the structure 
of the plant is unknown, an iterative search must be 
followed. This can sometimes be avoided by the use of a 
nonparametric method. For example, the step response of a 
system is a nonparametric model which can yield insight into 
the system structure. 

Such a two step approach to modelling illustrates that 
we really -have two problems to consider. First is the 
problem of identification of the plant structure. The second 


problem is the estimation of the unknown parameters. Two 
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families of models commonly used are state space and 


Eranstersfunceion7!onrinputroutputys models: 


2.2 State Space Models 

The behavior of a linear dynamic stochastic system can 
be described by the state variable model : 
x(k+1) 
y (k+1) 
z(k+1) 


okay koix( kwtot (kady k uth) ee 0 (keds k) wl k) 
H(k+1)x(k+1) (2k) 
yikrlvestevitkté) 


In this model, x is the state vector, which is sufficient to 
specify the position of the system. u is the forcing or 
control vector; w is the disturbance to the state. y is the 
true system output, while z is the measurement corrupted by 
noise v. 

The state vector, control vector and meaSurement are 
all zero mean with dimensions s, t, and r. The sequences 
{v(k)} and {w(k)} are independently distributed with zero 
mean, and covariances V and W respectively. 

This type of model can be derived from a theoretical 
analysis of the process. The state variables may or may not 
represent observable and/or controllable system attributes. 
It can be shown (Kalman,(1963)) that there are an infinite 
number of state representations of the same system. This is 
due to the fact that the relationship between the input u, 
and the output y is not affected by a nonsingular linear 
transformation of state variables. Two models in the form of 
Equations (2.1) are said to be equivalent if: 


12 Their input-output relationships aresethe same forgthe 
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case of no noise. (v = w= 0) 
2. The statistical properties of the outputs of the 


unforced systems are the same. (u = 0) 


2.3 Input-Output Models 

Besides simply changing the state vector, there is 
another transformation of Equations (2.1) which is of 
interest. By applying the Kalman filter theorem to Equations 
(2.1), we obtain: 


R(k+1) = O(k+1,k)k(k 
( 


17k 
Vie) = GU) SC) eet eD ( 


PR MIK) BPSK CK 1) artk) 
a(k) 


) + 
k)u( (2.572) 
where x(k) is the conditional expectation of x(k), given 
y(k-1), y(k-2), ... and a(k) is independently randomly 
distributed with zero mean, and covariance R. Equation (2.2) 


reduces to: 


Vik) -="~Btz ?) atk O+Pelzabytatk) 
OT) Ca yy (293) 


for systems with one input and one output (Eykhoff,(1974)), 
As before, y and u are zero mean; a is independently 
randomly distributed. 

The common denominator in Equation (2.3) implies that 
DOth ehewinpucrsu, and the noise, a, are processed along 
different paths of the same system. Generalizing slightly to 
avoid this assumption, we obtain: 


Vice ze) oz Pru meme (zat) kK) 
Sze) (zit > (254) 


where the polynomials are defined as: 
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and b is the number of whole periods of delay. 

Equation (2.4) can also be obtained by representing the 
deterministic dynamic behavior of the system as the linear 
seed Ge See 
Vickie amy Gi) ke as) ii =-0fet. +; - = (26% 
in which the output, y, is represented as a linear 
combination of a series of impulse response weights, {v(i)}, 
scaled by the input {u(k-i)}. This is analogous to the 
convolution integral for continuous systems. Making use of 
operator notation, Equation (2.6) becomes: 


y(k) = v(z-')u(k), v(z-') = Ev(i)z-! i = 0, ..., 4 
(2.7) 


The operator v(z~') is called the transfer function of the 
filter. To parameterize the system somewhat more 
economically, v(z~') can be represented as the rational 
operator: 


v(z-') = w(z-t) 2-P 
o(z-') (27-35) 


where the value for b reflects the fact that one or more of 
the initial v(i) may be zero. Substituting Equation (2.8) 
DACKmINnLORECUaAL1 Ona (2200) 3 


y(k) = w(z71) z-bu(k) 
5(z-') (229) 


To include the effects of noise on the output signal, let: 


Vikimaew (Zee z-Pu(k) + n(k) 
Sze) (oem 


where {n(k)} is an independently, randomly distributed 


sequence. 
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The noise sequence may in turn be modelled as the 


output of a linear filter with random input: 


Thus, we arrive back at Equation (2.4): 


y(k) = w(z71) z-Pu(k) + 6(z-')a(k) 
Stzae) o(z-') (284) 


2.4 Frequency Response Analysis 

Given a restricted class of models to consider (e.g. 
state space or input-output), the task in a particular 
application will be to define the system parameters. 
Historically, this began with frequency response analysis of 
transfer functions. The simplest case is that of a sine wave 
forcing to a linear process with noise-free output. Here, 
the system's amplitude and phase response can be determined 
GPOMtasErecoOrding) of the? inputyeuy, andsoutput, y: 


IG(jw)| = ¥ +60)" =5=tS3 60" 
U 7 C2Ra 2) 


where T is the period of the input, and t 1s the amount of 
time by which the output lags the input. The amplitude ratio 
and phase angle, ¢, are related to the transfer function by: 
G(jw) = |G(jw) le? (20513) 

Both depend on the input frequency. The transfer function 
Calpeine principle, be found = frome the response of sthe: process 
over the entire range of frequencies, O<swso. 

There are two common graphical means of representing 
frequency response. The first is a pair of plots - amplitude 


ratio and phase angle versus frequency. This is called the 
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Bode diagram. The second is a polar plot of G(jw) in the 
complex plane, called the Nyquist diagram. 

Frequency response methods have become more practical 
with the advent of digital spectral analysis, and numerical 
Fourier transforms. However, these methods will give 
Satisfactory results only if the output signal to noise. 
ratio is very high. 

Jenkins and Watts (1968), and Wellstead (1981) describe 
nonparametric techniques of spectral analysis in detail. 
Ljung and Glover (1981) contrast the features of frequency 
and time domain methods. They conclude that the two 


approaches complement each other. 


2.5 Impulse Response Analysis 
If we use an input signal of the form: 
u(t) = ké(t) | (2.14) 
where 6 is the Dirac delta, the process output will be: 
yobs) ee ikevy te) eben Ce) (25015) 
where v(t) is the impulse response, and n(t) is noise 
corrupting the output. In practice, 5(t) has to be 
approximated by impulse functions with short but finite 
duration, and limited amplitude. In this case, inversion of 


the convolution: 


GS TNAC eR Gee ie el 7 Oke NED 
0 
will determine the impulse response. However, deconvolution 


may suffer significantly from ill-conditioned inputs and/or 
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nonnegligible noise (Rake,(1980)). Thus, the approximate 
approach is impractical. 
The currently accepted approach to impulse response 


analysis involves the use of statistical methods. 


2.6 Correlation Methods 

Correlation methods are based on the analysis of 
autocovariance and cross-covariance functions of stochastic 
processes. 

The autocovariance function is an important statistical 
property of a process, defined as: 

yGUGtipe ye, Cove = El uUCry ucts ye C2ee1 7) 

where the operator "E" denotes expected value. If the 
stochastic process is weakly stationary, a term that will be 
defined in detail later, then the covariance depends only on 


the time difference |t; - tz 


ee Oethingmthis, equaleto 7, 

vii er) =e el u(t utes ) 4 (29218) 
Similarly, the cross-covariance between two weakly 
stationary processes 1s: 

yu, oe =) BLU Ch),y Cte)! (2.19) 
On the ergodic hypothesis, the time average equals the 
ensemble average. So, for a weakly stationary, ergodic 
process u(t), 
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Similarly, the cross-covariance can be defined as: 
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xl, yt) w =P Limet  Sult)eulter) dr (2224) 


Using Equation (2.20), the following properties for the 

autocorrelation function are evident: 

iertituisranweven functionsofur« 

Zee AU, UO, 0) ely (0, U, 400, v7 

sow mein tO) =2E1 0? (td doero? (Cu). 

4, If a signal has only random components, then y70 as 17, 

5. A given autocovariance may give rise to an infinite 
number of time functions, but any given function of time 
gives rise to only one autocovariance. 

Similarly, the cross-covariance has the following 

properties: 

PU Vo, 1) Ay Uy, eT) 

Zee, VT eh py TT) 

3. y is not necessarily maximum at 7T=0. 

The integral definitions in Equations (2.20) and (2.21) 

become immediately useful if we wish to examine the 

connection between these functions and the impulse response. 


Starting with the convolution integral: 
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where T is some fixed constant, multiplying by u(t) and 
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In discrete form: 
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That is, if we apply an input signal of autocovariance y(r), 
the impulse response can be obtained by deconvolution. In 
particular, white noise has autocovariance: 
PACUny do Me ermnic On ta) (2,25) 
Now Equation (2.23) becomes: 
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y(u;y,T) 
By adding white noise to the normal operating input signal, 
we obtain a cross-covariance proportional to the impulse 
response. In performing such a test, it is our choice 
whether to apply a continuous or periodic signal. Wellstead 
(1981) shows that, using a periodic test signal, with 
impulse-like autocovariance over the period, reduces the 
variability of the estimated impulse response. This is the 
motivation for the use of "pseudo-noise" test signals, 
particularly pseudo-random binary signals (PRBS). 

In a manner similar to the above development for 
impulse response analysis, we can examine the connection 
between frequency response analysis and correlation methods. 
In brief, we can estimate a system's frequency response from 
the autospectrum, S(u,u,w), and cross-spectrum, S({u,y,w) : 

S(uy Us W) mamGinlO) SAU 5G) (Oe O73.) 
where S(u,u,w) and S(u,y,w) are the Fourier transforms of 


yiu,uU,7) ana y(u,y,7) respectively. 
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Thus, correlation methods can form the basis for both 
impulse and frequency response analysis. Correlation methods 
have the advantage of noise immunity. Both impulse and 
frequency response analysis yield nonparametric models. 
These are generally useful in exposing system structure. 
However, parametric models are much more useful in 


summarizing or predicting behavior. 


2.7 Parameter Estimation Methods 

To apply a parametric method in system identification, 
a certain model structure must first be assumed. Then the 
model parameters are estimated by minimizing an error 
between model and process responses. For parametric 
identification methods, an iterative search must be used if 
the correct model structure is not known in advance. In 
contrast, nonparametric methods yield the final model 
directly, 

We presented a general input-output description of a 
linear process earlier: 


View oz oe ze (kee zee) ako 
Stiaae Ones TD) (2.4) 


The objective of the various methods is to estimate the 
parameters of the polynomials in Equation (2.4) from 
Miput-output data. In Table 221, the important properties 
and computational features of several methods are summarized 
(Isermann,(1980)). Performance comparisons have been based 
on simulations (Saridis,(1974)); (Isermann et al,(1974)) and 


analytical methods (Soderstrom et al,(1978)). 
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Modelling is generally an iteration through the 
following steps: 

1. Specify the application for the model. 

2. Use any a priori knowledge to specify general features 
of theysystem. (e.g. linearity, time constants, delays, 
etc. ) 

3. Choose an identification method. 

4, Select input signals, sampling time, length of 
experiment. 

5. Generate measurements. 

6. Identify model structure and estimate parameters. 

7. Test the candidate model. 

8. Accept the model, or change the experimental or model 
Structure. 

In addition to choosing an identification method, many other 

tasks have to be performed. The general procedure consists 

of several steps, requiring both an understanding of the 


process, and a strong theoretical background. 


2.8 Summary 
The purpose of this chapter was to provide the 
terminology and motivation for the developments to follow. 
First, we examined the various categories of models. 
These categories provide the referents for all further 
developments. In this work, we will be concerned with 


experimental, discrete stochastic models. 
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Two families of models were next examined. State space 
models were outlined, since they are a significant landmark 
in modelling. As demonstrated, they are closely related to 
input-output models. In the remainder of this work, we will 
deal only with input-output models. 

The difficult step in building input-output models is 
identifying their structure. We will use nonparametric 
methods to identify system structure, and then use parameter 
estimation within this structure. The nonparametric 
techniques used classically are frequency and impulse 
response analysis. However, neither analysis, in its 
Original form, deals adequately with noise. Their connection 
with correlation methods, which deal explicitly with noise, 
will be taken advantage of in the following chapters. 

This establishes the necessary background. Correlation 
methods will be used to take us through the difficult 
initial step of model structure identification. Once a 
tentative model structure has been established, nonlinear 
least squares estimation for an input-output model will be 
used. 

This method of attack is referred to by IsSermann as 
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3. BUILDING THE NOISE MODEL 

In this chapter, we will develop the foundations of the 
correlation - least squares method. First, we consider 
briefly the interpretation of the class of models selected. 
These models are based on two representations - the pure 
autoregressive and pure moving average forms. There are 
several interesting parallels between the two forms. One 
parallel appears during a close look at convergence 
conditions. Another appears as we consider identification 
tools for the two forms. The autocorrelation function is 
related to the moving average form; the partial 
autocorrelation is related to the autoregressive form. These 
two functions are defined, and their application is 
demonstrated in an example. One additional tool - the power 
spectrum - is also introduced, and applied in the same 
example. 

The class of models which we propose to use: 


y(k) = w(z7t) z-Pulk) + @(z-')a(k) 
5Cz- 4) o(z-') C328) 


consists of two parts. The first term describes the 
deterministic part of the plant. The second term describes 
the effects of stochastic signals which are always present, 
Passing thuoughrthe plant. The stochastic signals wiidebe 
referred to as noise. 

Identification of the noise model will be considered 
Sirrst® Thisewill peauwee several concepts and tools which 
will be developed further in modelling the plant transfer 
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The noise model subproblem is of considerable interest 
in itself, and falls within the general topic of time series 
analysis. To begin with, we should determine how to 
interpret the class of models: 


n Chog=86(0z°") a(k) 


g{z*} GGr2)) 
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Yule (1927) proposed that a series of observations which 
appear to follow some pattern can be regarded as having been 
generated from a series of independent shocks. That is, we 
can picture the process noise as having been generated from 
white noise passing through a linear filter. 

In Equation (3.2), we wrote the filter as a rational 
operator to economize on parameters, by using both past 
inputs and outputs. The pure input formulation: 

n(k) = #(z-*)a(k) | (3.3) 
where 

COU AR Ue spiny aN VF UR eee 
is referred to aS a moving average (MA) process. This is 
Somewhat of a misnomer, in that the weights do not 
necessarily sum to unity. 

The pure output formulation: 

Ties) ty Ks) ee= aa. (ky) (3.4) 
where 
jUNCCA me SP PT ARR rb ey aear § 8 PM ge ts cere te 


is referred to as an autoregressive (AR) process, since the 


output is being regressed on itself. 
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3.1 Stationarity and Invertibility 
Consider a process written in pure MA form: 

n (kK) RawtLigatk-j) prqsen0ee 28) so (385) 
where both a(k) and n(k) are zero mean. The variance of the 
process is defined as: 

a?(n)=e@e(n*(k)) (35.69 
Substituting for ntk)efrom@Equation (3.5). 

oe (no@=REZy7iaz (hog) 4 ja=e00e. 32; “ce C3070) 
Also, by definition: 

E@aetk))) “=. a7 (a) (3a) 
SubstitutingsEquation=(3:8) into Equation (3.7) and 
Simplifying: 


a?7(n) = o7(a)Zyw? 5 


ii} 
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™ 


i} oe Eh) 
which makes sense as long as the serieS Sum converges. 

So, for n(k) to have finite variance, #(z~"') must 
converge for |z~'|si. This is known as the stationarity 
condition. 

If we consider the pure AR process: 

Ihe@ze) nick) =a (k) C3710) 
then following a similar procedure as above, we obtain: 

o?(a)M=eo? (nn) 277 {z= ORR). ae (3.18) 
This says that for the observed series n(k) to have arisen 
from a series of finite variance, then II(z~') must converge 
for {z~'|<1. This is known as the invertibility condition. 

But since we have elected to use finite order 
polynomials, what is the motivation for considering these 


infinite order polynomials ? Via polynomial division, a 
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finite order moving average process can be represented as an 
infinite order autoregressive process. And a finite order AR 
process can likewise be represented as infinite order MA. 

So the stationarity condition that #(z-') converge, 
where: 

n(k) = #(z-')a(k) (3 22) 
is important for AR processes only. (A finite order moving 
average operator will always converge.) 

Conversely, the invertibility condition that II(z~‘) 
converge, where: 

Th ete) Ge) ca Ck) flats) 
is important only for MA processes. For example, consider 
the MA process of order one ( MA(1) ) 

nGjeaer la OZ) a Gk) Vora) 
where |6|<1. ThiS-can be represented as an infinite order AR 
process ( AR(@) ) 

Tia nk) a= ack) (Sn 
whereim, = —-0'’. The invertibility condition is satisfied; 
Givene || <1. 

If we consider representing the MA(q) process: 
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aS an autoregressive process: 

6(z-')n(k) = a(k) (3 eahod2) 
where the original polynomial can be factorized as: 
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whicheconvengessionly for |Ti |<theThe roots of 6(z7') must 
lie outside the unit circle for invertibility. 
If we consider representing the AR(p) process: 

(Vie on zee InGk a aealk) -fo= 4 ee op (3,20) 
aS a moving average process: 

pal ki) ies ombie(zyitya (ke) CS 2) 
with the original polynomial factorized as: 

Ghz ie) H=RIMUGiAPF zee (36122) 


then the inverted polynomial is: 


[toe aS eae (3273) 
which converges for |P,|< 1. The roots of ¢(z~‘') must lie 


outside the unit circle for stationarity. 


3.2 Further Remarks on Stationarity 

It should be stressed that stationarity implies that 
the process maintains statistical Ee prea 

Strict stationarity implies that the mean, variance and 
all higher order moments are constant. Stationarity of order 
k means that all moments up to order k are constant. Since 
only the first two moments are necessary to define a 
Gaussian distribution, assuming normality plus stationarity 
of order 2 implies strict stationarity. 

However, considering the typical changes in an 
industrial plant, it is perhaps unreasonable to assume that 
process disturbances will have a fixed mean. If we consider: 

(1 - oz-')n(k) = a(k) (3.24) 


with |@|< 1, the process is stable. With |¢| > 1, n(k) grows 
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exponentia Ply s Buta~m: £10 (=*1" Ithen: 
(MS iome) nik yt= alk) (3%2'5)) 
On: 


me) = m(k- 1) 


i) 
10) 
~~ 


(i372 6D) 
so that now we are concerned only with successive changes in 
n(k) purather than absolute values. 

That 1S, we model the changes in n(k). The observed 
process noise 1s considered the result of integrating these 
changes. 

If we define: 

Visi trsiz74 (3.2%) 
then Box and Jenkins (1976) refer to the model: 


Venki =e (z> eatk) 
Oza) (320) 


aS an autoregressive, integrated moving average model of 


order (p,d,q). This is abbreviated as ARIMA(p,d,q). 


3.3 Autocorrelation 

There are three basic tools used in ARIMA model 
identification. The first of these is the autocorrelation 
function. It is related to the less general concept of 


variance. Variance is defined as; 


oem e= CB bn? Ck] (3.29) 
where 

nk i=? fk an 

fi(k) = observed value 

mn = mean value 


Variance is estimated as: 
She = 1 /N) cose sole anN (3.30) 


The more general autocovariance is defined as: 
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y¥(j) = -Efn(k)n(k+3)] (3%3 0) 


Itlps best “estimated as: 


AUG) CANS Gi )n OF Sj aB1 Ber ee Pa eN= 5 80(3.53 2) 


Note that y(0) am 


To obtain the autocorrelation function, we normalize 
the autocovariance: 
Py) MS yO FOO) er 33) 
The autocorrelation of a process is thus independent of its 
variance. 
The autocorrelation function is estimated as: 
niepie=ec €5.)/c 00) Sn a) 
What does the autocorrelation indicate ? Consider the MA(q) 
process: 
nk t= 6( zen) atk) (sino) 
Substituting Equation (3.25) into Equation (3.31) we obtain: 


wa. se Cqyvattsq)) x 


veypeE= 65 CaGk ie Gt)a iis 
(a(k- 6(1)a(k—J=9)-. 20- 0a(k-3-G)1) 3.36) 


(k- 
Dis 
Formek = 0. we obtain: 

vOVME= met tO 2001) t ect Oat Qui) m0 aay) CS SHE, 
and for k # 0: 


eS oe G(Q— J) ONC)» Cl ca)) Mayme) antec ee Gl 


y(j) = 
( 0 , 3>q (3.38) 
We have used the fact that: 
(a? (a) te hie es 
Efatk)a@ksiy hw=4 (3.39) 
0 bre 2.40) 
\ 


to obtain these results. If we look at the autocorrelation 


function: 


p(j) = 0 5 Abc fe (3.40) 
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That is, for a pure moving average process, the theoretical 
autocorrelation function is zero beyond lag q. Since a 
finite order autoregressive process is equivalent to an 
MA() process, its theoretical autocorrelation function will 
be infinite in extent. 

In@practice,pwetwitll estimate ptk)afromeaaseniesr ot 
measurements. We will need some criterion for deciding 
whether p(k) is effectively zero beyond a given lag. The 
variance of the estimate r(k) for a stationary normal 
process, is given by Bartlett (1955) as: 

Cet Ge =e Niet pe (a) te pitt) peisg) 

~4ohe) pi iyo (isi omt2p2 Ga ) pte) sh (32449 
mE the sautocorrelationgisezerogionugkiziq: 
Coca =e Nuieaat 2opes ha) } =), .2.,09, 4) 22.9 (342) 
Substituting r(i) for p(i), we obtain the large lag standard 


error of the autocorrelation estimates. 


3.4 Partial Autocorrelation 
Since the autocorrelation reveals the moving average 
nature of a process, we might expect another tool to reveal 
the autoregressive nature of a process. This tool is the 
partial autocorrelation. 
Consider the AR(p) process: 
@(z-*)n(k) = alk) (3.43) 
where: 
OCzmie=s le BLOTS oe ioe year D (3.44) 


Changing the notation somewhat, let ¢(p,j) be the j'th 
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coefficient of an AR(p) process. We now have: 

PUsme ees 2¢(p,1)27 Stee ee eee (3.45) 
If we define the set of autoregression coefficients, 
{@(i,i)}, as the partial autocorrelation, 

$(a% i) wer Oamita>ap (3.46) 

for a pure AR(p) process. That is, for a pure autoregressive 
process, / the: partials autocorrelationscuts off forgi Pep. 
Since MA(q) is equivalent to AR(~), the theoretical partial 
autocorrelation for a pure moving average process is 
infinite in extent. 

There are two alternatives for estimating the partial 
autocorrelations. The first is based on a solution in terms 
of the autocorrelations, called the Yule-Walker equations. 
Consider the AR(i) process: 

pibkoepoCemten(k-1)+...+6(1,2)n(k-1)> + alk) (3.47) 


Multiplying by n(t-k) and taking expectations: 


ininesr lisa yijad) +. reted (igdevbneidagelde 0 (3.48) 
One 
waiver ol ied) oc jeidtee the (api) o(gpdd tran cee (3.49) 
Somthat: 
Picky). @ Ck) @ = 04 ke) (3.50) 
where 
1 p(1) 5 Olea) 
P(k) = |p(1) 1 pi kg2e 
HORI meee 1 
Aol = ees, Ome eee eis 
pment) sp (2). ... pCk)] 


These equations are solved for increasing values of k. 
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Durbin (1960) presents a recursive algorithm for solving 
Equations (3.50). 

However, the Yule-Walker equations are not well 
conditioned. A more stable alternative is to fit 
autoregressive models of increasing order to the data, via 
least Squares. 

Again, to determine whether a particular value is 
Significant, we need a test. Quenouille (1949) has shown 
that, assuming the process is AR(p), the estimated partial 
autocorrelations are approximately independently distributed 
£OLrgTEep.irhen: 

BECO1 id R= KAZN), .i 5 -Np Gar5 tr) 


where N is the number of observations. 


3.5 Power Spectrum 

Neither the autocorrelation nor the partial 
autocorrelation will reveal the presence of periodic 
components in the observations. The Fourier transform of the 
autocorrelation yields the power spectrum, which will show 
periodicities. In particular, a large peak in the low 
frequency range may indicate a slowly changing level or 
slope which may be removed by differencing. 

Given an odd number of obeservations, N = 2qt+1, we fit 
the Fourier series model: 
nikjmeta(0) +hE(alinchipkder6(itsli¢kietpetkite, 1=dee. Fig 
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The least squares estimates for a and 8 are: 


a0) a=aen 
A maeee/ Nin (kc ink) a =e (3.53) 
Dig eam C2N) Dn Ck) sla kaka meee N 


Now that the data has been transformed into the frequency 
domain, we must represent it somehow. The usual technique is 
to plot signal strength versus frequency. This is known as 
the signal spectrum. The signal strength, or intensity, at 
frequency f(i) is defined as: 
ORG eye ueUN 42 Ca Ai et eb (219) = 18 A 1 (3.54) 
For an even number of samples, N, the procedure is modified 
siiightly. Let N'= 2q, ‘and: 
= (1/N)E(-1)*n(k) k=1, ..., N 
SANG) (3.55) 
See) 2 Oi )eand er Gr (ir) befors12= ole ee eOomaSebpetores 
For a truly random series: 

n(k) = a(0) + e(k) | (355) 
That is, the signal will have only a DC component equal to 
the signal average, plus some error component causing it to 
varvepabout this value. In this case, the expected evalue: of 
1@£(i)) is 202(n), distributed as o?(n)x?(2). (The intensity 
of white noise is uniform at all frequencies, and equals 
202 (0) a) 

If there are periodic components in the series, the 


power spectrum will show an increase of intensity in the 


vicinity of the frequencies of these components. (See Figure 


Sealy) 
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Figure 3.1 Example Power Spectrum 
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If we integrate and normalize the spectrum, deviations 
from the expected behavior of white noise can be assessed 
via the Kolmogorov-Smirnov bounds. (See Figure 3.2) White 
noise has a uniform frequency content, so its integrated 
spectrum will be a straight line of slope 207. Normalizing 
by o”, we obtain a straight line of slope 2. Limit lines can 
be drawn at distances +K(e)//q above and below this line. 
For white noise, an excursion over the limit line will occur 
with probability e. On Figure 3.2, the lines nearest the 
central white noise line will be crossed with 10% 
probability. The lines further out would be crossed by white 
Horsewwithe peeprobabllity. sApproximate Wy # K00.01)e=ie.63, 


anaeK (0. 1 jm= 4. .2.2'. 


3.6 An Autoregressive Process 

In this section, we present an example of identifying a 
pure AR process. The advantage we have here over the real 
case is that we will actually specify the underlying 
process. We will generate a series of samples (termed a 
"realization" of the process), and then see how well our 
tools reveal it to us. We will examine both numerical and 
graphical representations, although we would normally rely 
on the graphical information. 

Consider the AR(2) process: 
Got 6zedt0.332°7)ntk) == ake rack) = oN(0;,0.05). 43557) 
which has the roots at z~'‘ = 1.5 and z~' = 2. If we generate 


a realization of this process using the initial values: 
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nt=euOesee2.25¢ nid)asra 2 (3.58) 
we obtain the results presented in Table 3.1. The 
autocorrelation and partial autocorrelation estimates, with 
confidence limits, are presented in Figures 3.3 and 3.4. 
Note that the estimated values in Table 3.1 compare quite 
well with the theoretical values for this process. The 
Significant results, however, are in the figures. The 
estimated autocorrelation is significant over a number of 
lags. This is as it should be for a pure autoregressive 
process. On the other hand, the partial autocorrelation cuts 
off after lag 2. This process would be tentatively, 


correctly, identified as AR(2). 


3.7 A Moving Average Process 
As for the AR(2) example, we will specify a process, 
generate a realization, and see how well our tools work. 
Consider the MA(2) process: 
rekder= (lepe05z2'FO0e62  2)a (kee ea( ky e=BN C071) (359) 
The numerical results are presented in Table 3.2. The 
estimated autocorrelation and partial autocorrelation are 
presenued inihigures 5.58anda3.¢. As@belorem itherestimares 
compare favourably with the theoretical values. The partial 
autocorrelation is significant over a number of lags, while 
the autocorrelation cuts off after lag 2. Again, this 


process would be properly identified as MA(2). 
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3.8 A Mixed Process 


Mixed processes are the norm in real applications, and 
represent the most difficult identification problems. Here, 
it 1s not simply a case of one function being continuous, 
and the other cutting off cleanly. Both judgement and 
experience are required. An additional complexity is 


introduced by the possibility of nonstationarity. 


Actual Estimate Actual Estimate 
Ori) + 0.872 Gon I O.Sre 0.818 
p(2) O.68.2 0.606 O22? eee rts 3 seats 
PS) 98 O. 502 0.429 GAi3i, 3) moO =07 0:18 


Table=3. 19 Resultsetor ARC) 


Actual Estimate 
re) 02689 0.643 
De) Or 2oZ Cane Oa 
p(3) 0.0 -0.039 


Table 3.2 Results for MA(2) 

To test our tools, we use a realization of the 
ARIMA(1,1,1) process: 
eee? cee VINGK) =n Gl—-OeoZs 1) ak) sya Ke = NO i) (Sts) 
This can be rearranged as: 

UE mOmonGk—1) +Ocen (k=a2) a (ka On Satikaal) (Sst) 
MremauLlocorrelabion plot (Figure gs. / )iSeSi oni ti can twOv.eiuea 
large number of lags. That this could be due to 
nonstationarity is confirmed by the large peak near DC in 


the power spectrum (Figure 3.8). 


az crtaniqnos: 
r¥e Pm Ay 


s2amised [susoA 
arG.0 st6,0.. (hy bles 
Elyu~ cé.i0- ($, S06 
‘Btds0- 0 (RE Mes i 
(c)eakix6? sefuzem te aide? 7 toe 


gtaesied.. aa 1 te a 


aback: Real Ga, 


: iad vide a ie 
4 a 4 we hae ae ae y < sale 
r£o.6 ses othe ae a 
2 [ se iA Pee 
eet CRD EA: rime 089 gine 2} hee 
7 hi wee ne a) oi (OR - 


(SHAM 207 stivsed a ‘¢-aiaer- 

efit io TObARSL LSID | A oa uth fehoos. eer 
pqessorg bi, ey 

(OBE). Uh TIM (ihe (aye(' ome 0-0) a tba ng 
sag) bSpnetiees set 

(taut) (7. AnBs OL} 2x65) ss nN we ila 
$ revo inaottingle @f (°.0 e1ugi) 260 nokag iavwoe 
oni ad one edd —— te ita 


oF 


| | | | 
| | | | 
“ Best 
Y) 
Li | | a | 
O | eg 
© ee 
re ce eee | 
eo eae 
a | et | | 
N co oie Rete a 
2 Fe eee 
< Pt ey 
Bee ray | 
oe I+} | 
= ee 
a a ee ee 
ee ee ae ee 
a 
teeta nl ined ge a es il sees cae ct 
oz°t = OorT,——(itiéiOS~*C«CC~*«‘é«i HO 020 ao°O = Oz" 0 


NOILV TadydOOOLNV 


20.00 24.00 28 -OC 


16.00 


Figure 3.3 AR(2) Autocorrelation 
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Figure 93.4 AR(2)*%Partial Autocorrelation 
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Figure 3.5 MA(2) Autocorrelation 
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Figure 3.6 MA(2) Partial Autocorrelation 
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Figure 3.7 Autocorrelation of Simulated ARIMA(1,1,1) 


= ® oa af ~# 
a ee Se. a are © i ae a 
“a 4 : a 
- 7 


F a 


= - - ; — -_-—— ++ 


mm * 


‘ 
ene ll A A A a ES ALLL EAL AAS 4 


; 


ee a 


42 


ARIMA (1,1,1) 
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Figure Be Power Spectrum of Simulated ARIMA(1,1,1) 
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After taking the.first difference of the data, both the 
autocorrelation and partial autocorrelation die out 
quickly. (Figures 3.9 and 3.10). 

This process would probably be initially identified as 
ARIMA(1,1,0), due to the behavior of the autocorrelation 
plot of the first difference. Diagnostics in the parameter 
estimation stage would later force an increase to 


ARIMACT 1 10", 


3.9 Summary 

The main purpose of this chapter has been to develop 
and present examples of the autocorrelation, partial 
autocorrelation and power spectrum. These are the tools with 
which the structure of the noise model is identified. 

Along with this, the concepts of stationarity and 
invertibility were presented. In the context of transfer 
function analysis, these correspond to the classical 
concepts of stability and minimum phase behavior. 

But the techniques presented here must be developed 
further to be useful. To actually identify the noise model 
and estimate its parameters, the noise Series must be 
available. This will only be possible if we have some means 
to partition our measurements of the plant output into 


deterministic and noise components. 
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Figure 3.9 Autocorrelation of Differenced Data from 


ARIMA(1,1,1) 
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4, TRANSFER FUNCTION PLUS NOISE MODELS 
In this chapter, we will extend the results of the previous 
chapter. The fundamental tools of the correlation method 
have been introduced. First, we will consider a 
generalization of autocovariance, called cross-covariance. 
This function aids us in exploring apparent causal 
relationships. 

The cross-covariance is of more utility, however, once 
we establish its connection to the impulse response of a 
system. This connection will be shown to be particularly 
direct when the input to the system is white noise. With the 
System impulse response available, we are able to partition 
Our measurements into deterministic and noise components. 
This puts uS into a position to identify both transfer 
function and noise model structure. 

The last step is the simplest. Given a candidate model, 
we estimate parameters by least squares. Since this 
procedure can have trouble when starting, we first consider 
how to obtain good initial guesses. Then we examine 
nonlinear least Squares in detail. 

Lastly, once the parameter estimates are available, we 
must have some criteria for deciding their acceptability. 
This is covered in the section on diagnostic checks. To 


conclude the chapter, we present an illustrative example. 
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4.1 Cross-correlation 
Recall the class of models being considered: 


yuk jmeamouzs)z Duk) =on(k) 
'Gzm' (4.1) 


One idea intrinsic to this class of models is that u and y 
are causally related. In other words, we are interested in 
the conjoint variation of these two variables. 

We define the cross-covariance from u to y as: 

uw = Bbuck)y(k+5 i) (Ame) 

where both u and y denote deviations from the mean. (Note 
that the autocovariance of y is yly,y,j).) 

In general, y(u,y,j)#y(y,u,j). But note: 
VUUny ji) EU oe) ya). T= muy pu@h ay bisa y( yu, — 9) @4e 3) 


In a manner similar to autocorrelation: 
p Cr pyinl) esi Gu i) k=O Pane, 
atuyoty) 


defipnes the cross-correlation function. The cross-covariance 


(4,4) 


is best estimated as: 


YAN ru (ly (king) kad, Bena ReN = apiOni pas: 
GUI Yaa ees 
[Gv Ey kd BO-3) kad expen). bjsret, —2, 


To estimate the cross-correlation: 


BURY oi) = clu, , i) 
Ye(u,u,0) ve(y,y,0) (4.6) 


As before, we will require a criterion for deciding whether 
a particular term in the cross-correlation function is 
significant. Bartiett (1955) shows that the approximate 
Standard error of a cross-correlation estimate is: 


auuatupyns LStNay)r tal kotu,u,i)ely,y,i)+e(u,y,j+i)p(u,y,j-1) 
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BCUe vps ny 1) 20/2) pau) est 2) 7h) 
=2p(u, yy no wuguel )oGuety , ee +p(u, Y,7 eed eon is) 


} 
] 
1 = 00! ce (4.7) 

Ef ufandsy tanewnot ¥crossscornelated, o(u, ywri)j=O07esojthat all 
but the first term in Equation (4.7) are zero: 
Garris yout )s) = (Nn jeep (UU, 1) OV yt) j=—c | (4.8) 
If u is white noise: 

PUG 7a) =O EO (4.9) 


and by definition: 


Pg 0) Geass 3) 9) (4.10) 
So on the hypothesis that u and y are not cross-correlated, 
and u is white noise: 


CACY york) ae ON aaa Casal) 


4.2 Input Prewhitening 
Suppose that after differencing the original series d 
times: | 
Vac eae CO ACK) Ar eV (ale) (Kit) et eset eet) GK) Benton ce) 
where the weights {v(i)} are the system impulse response. 
Multiplying by u(k-j), and taking expectations: 
unin paleeaie UO), jt VA oy Gun Use) 1) a tee remy. (15 0 eat) 
(4s) 
Assumingethat,uik-j)) is uncorrelated swith nk) forall 3: 
Sy Ne ay ce SAMO M A Oe ae VAC SCR oh Sa) Se gene (4,14) 
Beyond some point, the impulse response of the system will 
die out. Truncating Equation (4.14) at the point where the 
v(i) are effectively zero, and then writing the system of 


equations from k=0 to k=K: 
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yuy = Tuu V C4515) 
where: 
77 Compe 0 Livery Ou 5 uy 1d Tp Cis Blips: 
Tuu = ry Gus ey emery.(  tr..0)) yale) cal) 
7 CU Ural any te, jt) ie y(u,u,0) 


wuyssely(Cumy #0). sieyGusyekid) ev solv (Ojtractvca) | 
Now, by substituting estimates for y(u,u) and y(u,y), we can 
solve for the impulse weights. However, we can obtain the 
Same results more simply. Suppose that we represent the 
input u(k) as a time series: 
uCkin =) Ceuieza ) Jom zee) yaks a(k) =e N(0, a7 (a) (4.16) 
And model the output y(k) with the same filter, but a 
GUBierenGrinput: 

ViGK) = GNCG Ze t) /On Ze) 0 eK (4.17) 
Substituting for u(k) and y(k) in Equation (4.12) from 
Equations (4.16) and (4.17): 

BCkjm=av(z Ayaka tee(k) 

CAs) 2 (GiyaGra Boy ania” Ge) (4.18) 
Multiplying Equation (4.18) by a(k-j), taking expectations, 
and assuming no cross-correlation between a(k) and e(k): 

what peg) = vignia pha) (64449) 
which yields the estimate for the impulse response: 

Vil) Baa ay Cee cenay (4.20) 
This process of fitting a time series model to the input 
series, u(k), is called prewhitening. Obviously, if the 
inputeasewhitestoestart with; ithegimpulse responséiistjust a 


scaling of the cross-correlation. 
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4.3 Identification 
Once the impulse response has been arrived at, we can 
use 1t to estimate the noise series as: 
mGk)@=myik )@eSv.(z 20) uit ki) (421) 
Study of the autocorrelation, partial autocorrelation and 
power spectrum will yield the noise model structure. We can 
also use the impulse response to identify the structure of 
the transfer function. 
Recall the form of model being used: 
y(k)o= (w(27*)/6(299)) 2 ulk) (4n220 
where: 


QA Zia) 


iO!) Taito a) 2 
Oz ee) 


We 26.) za i 


"ou 
Ww 


and the definition of impulse response: 
VA) py, Comey UK) (4523) 


Equating Equations (4.22) and (4,.23) we obtain: 


SCZ er ze s)O Rk) Saaz i) z bak) (4.24) 
Egquating coefficients of equal powers of z~' in Equation 
(4.24); 
LA RAE esa TCS ee ere 
Ona wern= Dusen) 6 (jean oll ey eeh think, oopeic=be ..., bts 
fi A oS Da SS gmap ce 0) Joieeemi nent eee bts ty 


Ca25)) 
That is, the first b impulse response weights will be zero. 
And the weights v(i), i2b+s+1 will follow the homogeneous 
r'th order difference equation with initial conditions v(i), 
i=bt+s+i-r, ..., bts. The intermediate sequence of weights is 
dictated by a sequence of inhomogeneous r'th order 
Gatfecence equationsserinupantictlary vAb)Mrs thelhirst 


nonzero weight, so identification of b is straightforward. 
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Thegchotcepormsmea which sis normally elimiatedato 0, a1,..0r 
2, is less direct. The contrast in impulse response between 
processes of various orders is not always clear. Here, the 
step response weights are more easily interpreted. 

Let V(i) denote the i'th step response weight. Then: 

VAY PDA GO een Sein Was & (4.26) 
Without numerator dynamics (i.e. s = 0), the distinction in 
step response is quite clear. (See Figure 4.1) Zero'th order 
processes respond completely at the first step. First order 
processes follow a simple exponential curve. At least second 
order dynamics will be necessary to explain anything more 
complex. (overshoot with oscillation, etc.) 

The addition of numerator dynamics makes the issue less 
clear, by adding terms to the initial response. Zero'th 
order processes with numerator dynamics will show a linear 
inerease wupeto full response. But “1t7is very difficult to 
discern the presence of numerator dynamics in first or 
second order processes. By underfitting the model initially, 
and adding numerator dynamics only if the model is 


inadequate, satisfactory results will be obtained. 


4.4 Preliminary Parameter Estimates 
From the cross-correlation, we have an estimate of the 
impulse response: 
y(k) = v(z7')u(k) (eee. ) 
We wish to find a starting point for estimating the 


parameters in: 
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Figure 4.1 Representative Step Responses 
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y(k)g@es(aGzoe) /d(z2-%))zaeutk) (4.28) 
Equating Equations (4.27) and (4.28): 
i(zmvig hk) ubkd sho(zet)zioutke (4,29) 
Expanding, grouping terms, and equating coefficients of 
equal powers of z™' in Equation (4.29), we obtain the 
following preliminary estimates: 
No VRAIS deans 
2emelt *r> 0? Biindms (za) from: (4.30) 
Ad = V 
S=[S@l)ieye. son). Vely(btst1) Sev betsen) | 
Ley al) & aller Ss, abso 


0 jbaebtst2 
anagmmatrix Av usenxn. 


ey alGSR) ey Valey) (4,31) 
Vets >O merindsa 2 —:)ofrom: 
hoc aOe eb o 1 -GV (473 2)) 
where Berg = v(t ie 40) , sha 
0 ¢ jr 
LGGr=0  SOwetzv. fie, Sey) 


The noise model preliminary estimates are obtained as for 
time series models. (See Appendix A.) 

It should be kept in mind that these estimates are 
extremely inefficient. In cases where the objective function 
(sum of squared errors) is very large at the starting point, 
it may be useful to start with all parameters set toa 


small, nonzero value (e.g. 10°°). 


4.5 Nonlinear Least Squares 
In general terms, our problem is to minimize: 
f(x) = Q'(x,k)Q(x,k) (4.34) 


where © is a function that produces residuals at the 
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observation times, given the parameter vector x. The 
residuals are the difference between the model prediction 
based on parameters x, and the actual observation. f(x) is 
our objective function. The gradient is; 
Vig=020)0 (4.35) 
where J, the Jacobian, is: 
KaOexn CL yl FOR Skene (007 ax (mg). |4 
ae (4.36) 
(30 / ox Gait ewes. OF (00 Zax tm i, 
From this point on, let subscripts denote the stage of 
iteration. Approximate Equation (4.35) as: 
ify = 2d bone, | (4.37) 
and approximate Q;., by the Taylor series: 
ON =O ppm ete ka) (4.38) 
Substucuringutcuatron (4.38) s1nto (4737 )e 
Tie y=2d WOR (see Re 20 Oy (4.39) 
At the minimum, a necessary condition is: 
Vice =n (4.40) 
Substituting Equation (4.40) into (4.39): 
int a Seen aR Is); (4,41) 


Whichwis Gauss" method of least, squares. 


4.5.1 Levenburg - Marquardt correction 

To ensure that the matrix inversion required by 
Equation (4.41) is possible, it is sufficient that the 
matrix J'J be positive definite. Scale g's as: 


l= Cota voc" (4,42) 
where: 
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La Ge ee tea es (4.43) 
Positive definiteness can be guaranteed if we let: 

er GAloe Sea Sea res: 

Pou eel oer NC (4,44) 
where A 1S a positive constant such that A > -min{a;}. Here, 
the a; are the eigenvalues of II. That is, A guarantees 
invertibility by moving the eigenvalues of J'J into the 
right half plane. Substituting Equation (4.44) into (4.41): 

Kee xen Oy ena ato OF (4.45) 
We can further stabilize the algorithm by choosing: 
eet Poe Coen Contd |) J Cai suet lis Ga Tn, (4.46) 
AS we increase A, AC? dominates af Si and the procedure looks 
more like steepest descent: 

ie eee Kae li, 2) Cat AL (4.47) 
But also note that increasing > decreases the step Size. AS 
the minimum is approached, A should approach zero, to return 
us to Gauss' method. Thus, let: 

Mee = Ae (4.48) 
where pvp is a constant greater than 1. For the transfer 
function plus noise problem, the residuals a(k) are 
calculated in three steps: 

(ey kmermcz 9) /5(z-"))) zee (la) (4.49) 
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The algorithm will become unstable if there are 
redundant parameters. A redundant parameter will result in 
one column of the Jacobian being linearly dependent on the 
others. Thus, the matrix will become singular, and cannot be 


inverted. 


4.6 Diagnostic Checks 

Once a particular model has been fitted, its adequacy 
must be checked. Most diagnostics are based on the model 
residuals. The following five checks should be performed: 

1. The input, u(k), should not be cross-correlated with the 
residuals. If it is, there is a "residual" relationship 
between input and output which has not been explained by 
the model. 

2. The model residuals should not be autocorrelated. If the 
residuals are autocorrelated, but not cross-correlated 
with the output, y(k), then there is some difficulty in 
the noise model which induces this auotcorrelation. 

3. We can also test the first few terms of the 
cross-=correlation, r(a,a,k), or thesautocorrelation; 
rla,ajyk). This will be a chi-square-test, where: 

Maney Sexes? 7072", El x2 Cp) = sp (4.52) 
and: 
yp = number of degrees of freedom 
~ observed variance 
Oo?) = true variance 
a. The statistic: 


SaeeNir 4 (asa, i) ep N=(n-emaxGstbtp sp )s) (4.53) 
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will be distributed as x?(j-(r+s)), where: 


n = number of observations 
p' = prewhitening transform autoregressive degree 
j} = number of cross-correlations to be considered 


DomelthenStatistyes 
Oma Nin Sayan), N= (n=(Stbtp)) (4,54) 

will be distributed as x?(j-(ptq)). 
If either statistic is much larger than its expected value, 
theycormrespondingscorrelation ( r(a,a,k) on rlay,a,k) ) 
should be viewed with suspicion. 
4, The uncertainty interval for a parameter should not 

include zero. 
5. The power spectrum of the residuals should be white. 
If all these checks are passed, the model can be considered 
a good candidate. Given more than one candidate model, that 


with fewest parameters should be chosen. 


4.7 Stirred Tank Heater 

As an example, we will consider modelling a stirred 
tank heater. (See Figure 4.2) Our objective is to find the 
transfer function plus noise model for outlet temperature 
response to steam flow. 

To obtain the input-output data, an open-loop test must 
be performed. This requires that we perturb the input, and 
sample the plant output. 
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Figure 4.2 Stirred Tank Heater 
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it must be rich enough to excite all the modes of interest. 
This criterion is satisfied by a white noise sequence. The 
most common white noise generator is the central limit 
theorem generator. 
Approximately normal random variables can be generated 

Prom: 

[Ss GP AOR ess aye an A oes (4.55) 
where: 

ogtphe=alygErpD =K6 
and n is uniformly distributed in [0,1]. Uniform 


distributions are usually achieved by congruential schemes. 


For example, 


ny. eas.ch;(modulofm) (4.56) 
where: no is odd m=2°¢+1 
Ce=VSt+3 t some integer 


will yield 2°-? values in the range [0, m-1] before 
repeating. We could use: 


65539 = 8*8192 + 3 
234 +04 


Cc 
m 


Given an input signal, we must also decide on the sampling 
rate, and sample size. Sampling rate is determined by the 
need to recognize high frequency components. The highest 
frequency that can be recognized is determined by the 
Sampling theorem to be one half the sampling frequency. 
Sample size is determined by the need to recognize low 
frequency components. As a rule of thumb, estimates of lag 
components of more than 20% - 30% of the data should 
probably be avoided. The least constraining choice is to 


Sample as quickly as possible for as long as possible, while 
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acknowledging these caveats: 

1. Very large data sets make statistical inferences 
meaningless because of the explosion of degrees of 
freedom. 

2. The variance of a parameter estimate is roughly 
proportional to the inverse square root of the number of 
samples. 

3. Data collection can be a very expensive process. 

In the present example, a central limit theorem generator 

was used to build up a steam valve position input with mean 

50% and variance 20%. The sampling interval selected was 5 

seconds (0.2 Hz). A total of 200 points were gathered. (See 

Appendix B for a listing of the data.) This scheme is 

probably useful for frequencies from 0.01 Hz to 0.1 Hz. The 

test results are presented in Figures 4.3 - 4.6. 

Firstly, the cross-correlation (Figure 4.3) appears 
reasonable, in that an increase in steam flow is correlated 
with an increase in temperature. Note that the 
G€vess-cocrelation at lag | is jJUuSt signiticante=soethat Db = 
i 

The step response (Figure 4.4) shows a process with 
overshoot, so that at least second order dynamics are 
required.(i.e. r = 2) No guess aS to numerator dynamics can 
be made at this point. The autocorrelation of the noise 
estimate (Figure 4.5) dies out rather slowly. Inspecting the 
noise partial autocorrelation (Figure 4.6), it appears we 


have a largely autoregressive process (p = 1), but perhaps 
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Figure 4.3.Process Cross-correlation 
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Figure 4.5 Noise Autocorrelation 
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NOISE ESTIMATE 
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Figure 4.6 Noise Partial Autocorrelation 
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with some moving average character. 

In this case, the tentative transfer function plus 
norse*model#@has. orders® of (r,s) b)C=84 ppd gq) fequal-to (2P0} 1) 
=e (a), OF 0)e 

We find eventually that the diagnostic checks are 
Satisfied only with the addition of numerator dynamics, plus 
one moving average parameter in the noise model. The best 
model 1s “of ordersa(2,1,1)\@= (1,0, 1%: 

The plots of parameter value versus iteration number 
(Figures 4.7 and 4.8) show smooth convergence. 

The autocorrelation plot of the residuals (Figure 4.9) 
shows no significant terms. And the residual power spectrum 
(Figure 4.10) lies within the bounds for white noise. 

The cross-correlation between input and residuals 
(Figure 4.11) Shows one significant term, but this is at a 
very large lag, so it can be discounted. 

The parameter values obtained are given in Table 4.1. 
Thus, the model for this case is fairly straightforward to 
obtain. It should be emphasized that even for this problem, 
building the model required an iteration, from 
identification and estimation, through diagnostic checking 


and correction, thence back to estimation. 


4.8 Summary 


Trethesechapter, OUL firsts Stepawas, LOsUeianestiae 
cross-correlation function. Both the cross-correlation 


function and impulse response are nonparametric models of a 
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System. Perhaps, then, it is not surprising that the two are 
closely related. In the simplest case of white noise input 
to a system, the impulse response is just a scaling of the 
cross-correlation. Even when the input is not white, there 
is an orthogonal set of equations relating cross-correlation 


to impulse response. 


om 62 Wo W 4 om 6 
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Table 4.1 Results of Parameter Estimation 

This relationship is important, since we can determine 
the cross-correlation function from experimental data. Thus, 
we can estimate the impulse response. Finally, from the 
impulse response, we can identify the structure of the 
transfer function. 

With both the transfer function and noise model 
identified, we can attempt to estimate a set of parameters. 
Before doing so, we may want to generate a set of 
preliminary estimates as a starting point for the least 
Squares routine. 

With the parameter estimates in hand, we run through a 
Series Of diagnostic checks. If alle checks eareypassed, swe 
have a good candidate model. If the checks are not passed, 
we Oaity the model structure accordingly, and estimate a 
new parameter set. 

In developing the transfer function plus noise model, 


we go through an alternating series of steps. On one hand, 
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Figure 4.7 Fitting of Transfer Function Parameters 
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Figure 4.8 Fitting of Noise Model Parameters 
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Figure 4.9 Residual Autocorrelation . 
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Figure 4.10 Residual Cumulative Spectrum 
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Figure 4.11 Residual Cross-correlation with Input 
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we have a set of strictly analytical procedures(calculate 
and display autocorrelation, etc.). These steps can easily 
be automated. On the other hand, we have a set of pattern 
recognition steps. (e.g. Does the step response show first 
Or second order dynamics ?) These are less easily automated. 
The library implemented by the author relies on rich, 
well-defined graphical output, to allow the modeller to 


accomplish the pattern identification steps. 
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5. MINIMUM VARIANCE CONTROL 
The previous chapters have assembled the background 
necessary for modelling stochastic plants, via one specific 
method. This is perhaps useful in itself, but if we can 
predict plant behavior, then we should be able to obtain 
good®control* 
Consider once more the model: 
UR o S806 (29 )75(z-9) ) zack Seba tk pus ie (Saw o0z- th) atk) 
(521) 
where: 
y(k) = deviation from target 
u(k) = deviation from value which holds y(k) at zero 
f = pure process delay 
Our problem is to derive a feedback controller: 

UCKPA=OE (CY Ck) (552) 
which minimizes the variance in y(k). For example, consider 
the case: 
yeky =00(0)uGk-1)4n(k) NCR) =Cl=62° WAG) satkye C5e3) 
For the noise to have no effect at the output, y(k) = 0. 
Then Equation (5.3) becomes: 

“ek t=" G-1/000 ). in (kta) (554) 
To make the control adjustment, we must somehow predict 
n(k+1). From Equation (5.3): 

(ky BS. (120292) /(1=ézat yale) Uoe5) 
which can be rewritten: 

n(k+1) t=egn(k) #talk+t)e= @a(k) (536) 
Taking expectations at instant k we are left with: 


A(k+1) = @n(k) - @a(k) Sie 2) 
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where we have used the fact that, at instant k, Efa(k+1)} = 
0. Substitutingyfordn(k+1); in»Equation (5.4): 

u(k) ep(et/o(0))(én(kisGatk)) (5.8) 
Now, the best that this controller can do is: 

y(k+1) = a(k+1) (529) 
Since the error in predicting n(k+1) will show up in the 
output. That is, the minimum variance we can accomplish in 
this case is o7(y) = o7(a). Substituting Equation (5.9) into 
(5.8), and using the noise model to substitute for n(k) in 
terms of a(k): 

U Ghee es h/Ob0) 946-6) /(1-dz>') y{k) Prats LEO) 
which gives the control output in terms of the current plant 
output. Here, we have used the transfer function plus noise 
model to design a controller which will reject noise. In 
this derivation, we have implicitly assumed the desired 
process output (setpoint) to be zero. That is, we desire 
zero deviation from the steady-state value at which the 
model was developed. (Recall that both y(k) and u(k) 
represent deviations from steady-state.) Let us consider the 
case of a non-zero setpoint, say y(k) = ysp(k). Now, if 
notsemhas noset fectp_y Uk) e=eyspGk ke. ESubstitubingsthis into 
Equation (5.3) and rearranging: 

uch) e= Gysp, an/4olO) e-eGat eb) 7080) O54 1) 
The second term on the right will lead to the same result as 
before, so that the regulatory plus servo control law is: 
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5.1 Analytical Derivation 

When we derived the minimum variance controller for the 
first example, we took the following steps: 

Vee Set y(k) = 0, £0 obtainithe’control@law from the 
transfer function plus noise model. 

2. From the noise model, obtain the estimates n(k+i) 
required in the control law. 

3. Substitute these estimates to obtain the control law in 
terms of a(k), u(k)=g(a(k)). 

4, Obtain the relationship between y(k) and a(k) determined 
bya Ener predlcrei onserrors. 

5. Use this relationship to obtain the control law in terms 
of measurements y(k), instead of the noise model inputs, 
a(k). 

This procedure may be entirely acceptable. However, we 

require a constructive proof to show that the controller 

will give minimum variance. 

To begin, we require an expression for the variance of 
the plant output. We will then use analytic methods to 
derive the minimum variance control law. 

Note that above, we were able to express y(k) in terms 
of a(k) alone. This should lead us to expect that the 
variance of y(k) should be expressible in terms of the 
variance of a(k). 

Recall the original model: 
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By definition, the transfer function is just an economical 
parameterization of the impulse response, v(z~'), where: 
VAZEANS=RLVY 2 duse(oleniyy 5 tzu) (5.14) 
9 09 verses 
Substituting Equation (5.14) into (5.13): 
va ofa en =av (Zas'eu (Keke (6 (2M) /bo (ee) al keira (5.15) 
Recalling the third step in our heuristic procedure above, 
we should be able to express the control action in terms of 
the noise model inputs: 
UK) meee) AK — 1 eee ti za!) a Oke) (SG) 
B- SHORT. , © 
Substituting Equation! (5296) finto*(5015) ?*we* obtain yk) 
entirely in terms of a(k): 
yik+itad sev(zoteu(zeeya(k) c+ehe (pou) sole: th) avka£+1) 
(Sma7)) 
We saw earlier that it is possible to compensate for noise 
inputS occurring up to and including instant k. Certainly, 
all the terms in the first expression on the right side of 
Equation (5.17) fall into this category. We will have to 
cast the second term in a Similar form to examine it more 
closely. Define the impulse response of the noise model: 
wihz;  Me=eLy Li zee SeONzUby ota YP) (5.18) 
DPe=80 a. Thee 
Now separate this into two parts: 
UGzayam Wy 2 ' taza teeny (ze) (5.19) 
Sotbhat Ww, (ze') tistof degréeszt, fand z-(F+1y,(z-') is the 


remaining high order part of w(z~'). Then: 
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W(z>' Ya(k+£+1) = W,(2-' )a(k+£+1) + w2(z-' alk) (5.20) 
So W,(z~') operates on disturbances which have not yet 
arisen at instant k. Now, substituting Equation (5.19) into 


(Sxntiza)i: 


yolk bine =3 Hiv 2 ‘e)Gzr) )e te ler ‘dekak) chew iz “re) ealikeet B10 


(Sie2alp) 
Squaring: 
Vea Koettal,) Gp lv Gasainrecr &)+We (zo) J 2 atk) 
awa crm) e# a2 kee fiers) 
(tA zea Cae D) yoaCze™ ) 
ya ze Pe AGzeg) ja Gk a (kt iede 
(oe) 
and taking expected values, we obtain: 
OC VUE vate Gis it or et a eet ose) (5123) 


As required, we now have an expression for the variance of 
y(k). As mentioned above, it was possible to obtain this 
entirely in terms of a(k). And, as promised, we will now 
proceed to minimize this variance analytically. 

First, we require one simple result. We defined the 
noise model impulse response as w(z~'). That is: 
Giezaamin)y Gz meee Zee), WZ) il eZ ee tee. (5.24) 
Supstitucingeerom Equation (5.119): 
ezeiwi/ dG (ze a amin (22) tezc Ry at zat) (5.25) 
At some point, we will have to determine wW,(z~') and 
w2(z~'). This can be looked at as a problem in polynomial 
division, where W,(z~') gives the remainder terms. That is, 
we can define: 
eu zene zit) fo (z- & oe, WT Cz ai ea: 00 eT at!) 2 on oan 
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We will need this result later. Let us pick up where we 
left off, in the minimization. The second sum in Equation 
(5.23) is independent of our actions. Only the first sum 
contains terms related to the control law. (This should not 
be Surprising - recall that the second sum is related to 
as-yet unrealized disturbances.) So it will be sufficient to 
minimize the first sum. The minimum for a squared real value 


is zero, SO we require: 


Bi vaio) Dad) Pet Wo)? =20 ite ae) 
re er ata Ceatet VN Cra eR) PSE sean ey ea” °c 8) 
This will be satisified using the control law: 
DUZaee) ene Zoe vi Zee) Meee sp 


This 1S a reasonable result. At the plant output, we could 
negate the disturbances with -W2(z~'). We move this action 
to the plant input by multiplying by the inverse of the 
plant transfer function. 

Substituting Bquation (5.28) eintowet5 167. 

UWCky eae, (zs '#) /v (cee ea Ck) (Seo) 

Substitucingmior.viz 29) of romeEquataons (5.14) >, and for 
We izes)eercomerquation. (5.26): 

Ul kye=een Co (zo) T(z) / (ole 4) 6(cs2)) walk) (5.30) 
Now we are at a stage similar to the fourth step in the 
heuristic procedure. We have u(k) as a function of a(k), and 
would like u(k) as a function of y(k). Using the impulse 
response w(z~') in the original model, we have: 
Vinee (olzare)/o( 20) uk faye ay Cz al ky (5a) 
Substituting for w(z~') from Equation (5.19), for w2(z"') 


from Equation (5.26), and for u(k) from the control law, 
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Equation (5.30); 

yk) = (ze val(k) (eS2) 
AS expected, the deviations are caused by the as-yet 
unrealized disturbances. Rearranging Equation (5.32) for 
a(k), and substituting into the control law, Equation 
Corrs Oily: 
mG) Meas MO CZIw) Pi2ae) 97 (olze )o(z- Wy (se )) vik Ses) 
This is the minimum variance control law, written in terms 


of plant input and output. 


5.2 Nonminimum Phase Systems 

Tie az s)ehas zeroes on or inside thesunube circle setne 
control law in Equation (5.33) will require infinite 
variance for the input u(k). Even if w(z~') is invertible, 
experience has shown that unacceptably large input variance 
may be called for. One solution is to add a constraint to 
the minimization. That is, minimize the output variance, 
o?(y), subject to an upper bound on the input variance, 
C00) For thesproblen: 

Mineo uyymSet. 07 CU jecumia) (5.34) 
the method of Lagrangian multipliers leads to a minimization 
OL; 

FuCUyiyae Noe 0 CY) aC aU) BNC cil) (5.35) 
Since we cannot effect o7(a), this function will be minimum 
tei 

G (iy, Aen=ino 07) eee OG) (5.36) 
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u(kimaneliaia (ksi) e=dLtz-dlakk) C5e16) 
Lies Or aemrer ne 
the input variance can be expressed as: 
Gb) M82 ao? (a) , 9 10=) 0,72 ee C5e3 79 
Substituting from Equations (5.23) and (5.37) into Equation 
Se 6 Je 


aes (5738) 


Again, the second sum is independent of the control law, so 
we need only to minimize: 


CUlny, Wes atl tly ani kh) VCroK))) 2 tN ay) toa (ems 9) 
1 P2304 oto eoke07 ¢ hentai 


where the first Summation has simply been rearranged. 
Differentiating, and since the first j terms are independent 


Substituting Equation (5.240) into (5.38), differentiating 
the second term, and equating to zero: 

CyeGitesy) Werlerxl (j)g= OiRpeo eie=5, Leati= (5.41) 
where: 

Wt =a a (al) PEO LE Choy Glia) ey) k=O a (5). 42) 
To get from the set of Equations (5.41) to the final form 
for the controller requires three somewhat involved steps. 
First, we will rewrite Equations (5.41) in operator form. 
Then we will perform a spectral factorization for the sum of 
two polynomials. Lastly, we use partial fractions to 


rearrange the equation. The reader may wish to leave these 
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steps for a second reading, and go directly to Equation 
Co.53) 

This first manipulation is a rather elegant approach, 
due to Wilson (1969a). 

The equations for the L; can be solved much more easily 
if we can pass back to an operator notation. Consider: 

VAZIWCAH" eet EAE Ga* 1); =(H Gz) (5.43) 
where: 

Héz) tf]. 2H Gi jz! le Ee Weleda ey Be 

WZ ee wT Zameen et mie Oz a) 

Expanding the left side, and equating coefficients of equal 
powers of z, we find that H, = 0, is0, is required to 
Satisfy Equations (5.41) The coefficients of H;, i > 0, are 
not constrained by Equation (5.41), and will be left free. 
That is, we can embed the series problem of Equation (5.41) 
within the operator problem of Equation (5.43). Restating, 
our original definition was: 

H(z) v=) ZHC1)z' ik ee — Coase ne, co (5.44) 
but Equations (5.41) constrain this to: 

H(2g= DER) 2. boastank. sere 65.645) 
which is a power series in z, with a zero leading 
coefficient. Substituting from Equation (5.44) into (5.43): 
VA iInGt wi) Vuze) et We (Zo 5) ieee Nees) ae (5.46) 
SupScrcucing tory (2 |) anc iWo(2n Jelrom Equations mio-d4) 
Bvate) (Gig ly) E 


Nb (Zz) 6 (ze OCs (2) er nz) 


+ 
jez d (zm) S(z)o(z-*) (5547) 
We now have two terms on the left, in various polynomials of 


z and z~'. What we are after is a solution for L(z~'). What 
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we require, then, is some way to disentangle the polynomials 
ineze “etvrometnoseminucz. Let's startewithethesti ist termmon 
the left. The spectral factorization theorem will let us 
perform the factorization as: 

GZ) OUZas) MtmeNG Uz) Oz oS) a aey (zy zene) (5.48) 
where y(z~') has no zeroes inside or on the unit circle. 
Further detail can be found in Wilson (1969b). Substituting 


Boomenquation (506 wernt o. (5.47 )% 


BZ ae) ez oz a) et OZ) Re = Ce) 
8(z)é(z-') 8(z)o(z-') (5.49) 


Since y(z) is invertible, we can multiply by 6(z)/y7(z) 


lL. (get iy 02 28) tatoo Gait (ze) = Hatz) 
SiGe) EAT Rayo) (5.50) 


where, Similar to H(z),: 

HO(z) =" 2Hyz) DECEU Uy tee 
Now, the first term is all in z~‘'. Separating the second 
term by partial fractions: 


40(z wn zee) 2207 ¢zas pertor (2) 
vy(z)o(z~') o(z-') y(z) (5.51) 


where Q2(z) must have a zero constant term for uniqueness. 


SUpSt1LutLIng ssromenquation (55,5 jin conto. 50). 


Cen) ize On Se = BOa Zee er zy) 
(zie) g(z-*) zy (3.2) 
When expanded, the right side is a power series in z, with 
no constant term. The left side iS a power Series in z™'. 


Thus, the equation is separable, and: 


Maze!) = 6(2°") On Zeny) 
y(z-‘')o(z-') (5e5s)) 


Substituting Equation (5.53) into (5.16): 
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Uk mew (Zi) O .Czeatk) 
viz aoe (zed) (5.54) 
The interpretation of this equation is much less clear than 
for Equation (5.29). Recall the associated Equation (5.30): 
CK ema CO Zemin 2%) (co 2 all) ol cunts) make) (5730) 
Looking at Equation (5.51), we note that Q,(z-‘') is closely 
related to T(z~'). The major change from Equation (5.30) to 
(5.54) has been the replacement of w(z-') by y(z°'). This is 
reasonable, since we were having problems with zeroes of 
w(z-') inside or on the unit circle. We can guarantee that 
y(z~') will have all roots outside the unit circle. Once 
again, we are in the position of having a control law in 
Serms ofr ack )<aSolving Equation: (5.54) iforea(k), 
Sunset rtut mngmintomequation (Seal); wandaleamrangangmrorsu(k): 


UGki)in= §€z-* )O 


1 y (k) 

MeA@ OVO. nese Waa hy i zs) OL zai) (5.55) 
which is the tinal form for this controller. In this 
derivation, we have dealt directly with two common problems. 
First, non-minimum phase behavior has been accounted for. 


And excessive manipulation of the control input can be taken 


care of by the same approach. 


5.3 Example 
If we use the subroutine library to derive the 
unconstrained controller for the system modelled in section 


4.7, we obtain: 


Uke 2 698 Cas eel Sn zie ee ee 68) yk) 
Cat eae = Ss) (5256) 


Cancelling the approximate pair near the circle: 
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u(k) = -39.74y(k) + 2.698y(k-1) - u(k-1) 
jemi 2 (5.57) 


Applying this controller, we obtain the results shown in 
Figure 5.1. Note that the transfer function has also been 
used to derive a servo controller, while variance from the 
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560.00 


6. SUMMARY AND OBSERVATIONS 
This thesis was motivated, in the Introduction, by a 
consideration of the effects of improved regulatory control. 
This was followed by a consideration of the categories of 
models - in particular, state space and input-output models. 


Using the state space description 


x(k+4)esro(ktipk)x€ki)jeteb(kse1, k)utk) mt Ph (kt1¢k) wk) 
y(k+?) = H(k+1)x(k+1) a) 
eikeapl =aydk+ 1) bt8v (kot) 


was stated to be equivalent to using the input-output 
description; 


yik) =nBbze ard (kjetacé(zth) Gath) 
em) izes) (263) 


aS a consequence of applying the Kalman filter theorem. This 
was generalized slightly to obtain: 


y(k) = wl(z-') zPul(k) + 6(z-*')al(k) 
Sizeun) ol za) 0284) 


where the polynomials are defined as: 


Giza er=ecot'0.) Sze. (it) Ze a i= aS eas 
Shc VNR=Ties Lodi) zine orsssj CARY W285) 
O07 ees) Mia ema Oi) Zam lh sate 
gpize )M=) iSoerela) 2 pm= ite 5 ae 


Then two approaches to developing such models - via 
frequency and impulse response methods - were defined. 
Basically, the difference between the two approaches is the 
domain of representation selected - frequency, or time. 
Next, the convolution integral was used to show that the 
correlation method is one way to perform impulse response 
analysis. It was also shown that the auto and 
cross-spectrum, used in frequency response analysis, are 


just Fourier transforms of their time-domain equivalents, 
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the auto and cross-correlation. Chapter 2 finished with an 
introduction of the path from nonparametric representations 
(e.g. impulse response), to parametric representations (e.g. 
transfer function). The iterative nature of the modelling 
process was emphasized, to ensure that the method to be 
presented would not be perceived as some sort of panacea. 
The observation was made that three major steps must be 
accomplished. First, we identify a candidate model 
structure. Then we eStimate parameters for this structure. 
After evaluating the results, either the model is accepted, 
or we make a change and iterate through the process again. 

In Chapter 3, the question of stability was considered, 
in conjunction with the noise model: 


ne) @=96 (zi 8) alk) 


ELT, (637.20) 
where 
CCZ mae) a= tel Oe BC ge aren Kel 
OM Ga OP FE See ra ive ea 
aiekyes= N00 2.Cay)s) 


First, it was observed that the autoregressive (AR) and 
moving average (MA) operators are in some sense 
complementary. A finite order MA operator can be represented 
as an infinite order AR operator, and vice verSa. 
Stationarity is defined as the convergence of an MA 
operator. Invertibility is defined as convergence of an AR 
operator. For finite order polynomial operators with bounded 
coefficients, convergence of the sum of coefficients is 
guaranteed. The issues of invertibility and stationarity 
arise only in the case of infinite order operators. So 


Stationarity is important for infinite order MA operators, 
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that is, for finite order AR operators. And invertibility is 
important for infinite order AR operators, or finite order 
MA operators. In either case, the roots of the operator 

O( Zit) POD olzal must Tie outside@thesunita@cirele) tor 
Stability of the noise model. 

Following this, the three basic tools for noise model 
identification were introduced. The autocovariance is 
related to the idea of variance. It is defined as: 

y(j) = Efn(k)n(ktj)] (3pen9 
and iS estimated as: 

¢ Oj) S= et ING TH nities )OUIS S81 he FONE Ss Ones 532) 
To remove the dependance of this measure on the process 
variance, we normalize to obtain the autocorrelation: 

Lea e=te(4yy7e(e) (3.34) 
The autocorrelation will indicate the order of a pure MA 
process. 

To determine the order of a pure AR process, the 
partial autocorrelation was defined. This measure is 
somewhat more of a made-to-order device, since the partial 
autocorrelation coefficients are actually best derived by 
fitting pure AR models of increasing degree to the data. 

Neither the autocorrelation or partial autocorrelation 
will reveal the presence of periodicities in the data. This 
type of behavior can be investigated with the power spectrum 
OfPehetdata. The procedure here is to fit a Fourier series 
model: 
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where: 
als Git | 
s(i,k) 


cos(2mf(i)k) 
Bant2rids ) kor fda oan (afND 


and then to represent the intensity at each frequency: 

BGe (by) deetN/20da> (Gdetab? 4 )dodd= lie. taeng (3.54) 
graphically, versus frequency. If there are periodic 
components in the series, the power spectrum will show an 
increase of intensity in the vicinity of the frequencies of 
these components. 

Following examples of the use of these tools for pure 
and mixed processes, it was then noted that to identify the 
noise model (i.e. to actually calculate the functions 
above), we must be able to partition the observations 
between the transfer function and noise model. 

Continuing to work with nonparametric system 
representations in Chapter 4, the connection between 
cross-correlation and impulse response was examined in some 
detail. It was shown that the cross-correlation from input 
to output is the convolution of the autocorrelation of the 
input signal with the system impulse response. Given that we 
have an input-output series, deconvolution can be used to 
obtain an estimate of the impulse response. This will not 
necessarily be an efficient estimate, but it will allow 
identification of the noise model structure. A type of 
deconvolution called prewhitening was examined in this 
EESDECUS 

With an estimate for the noise series in hand, we can 


identify the noise model structure. We can also identify the 
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transfer functionestructure as’ra)tresult cof this) same 
development. Once the output series can be partitioned into 
transfer function and noise model contributions, each part 
can be identified. The noise model is identified via the 
previously considered autocorrelation, partial 
autocorrelation and power spectrum. The transfer function is 
identified via the impulse or step response. 

With an identified model in hand, we can proceed to 
parameter estimation. This is done via nonlinear least 
Squares. The algorithm used is a version of Gauss' method, 
called the Levenburg - Marquardt correction: 

EMoat= axl Cee, Mew Ta cCnwerekeleiatealo; (4.46) 
and we let: 

MSS ENA. (4,48) 
where v is a constant greater than 1. The Jacobian in the 
above equation can be estimated numerically, or calculated 
analytically. The library routine developed uses analytical 
calculations, since the model structure is fairly simple. 

The model, once obtained, is only a means to an end. It 
can be used, for example, to understand the behavior of the 
Dlantiwntiinethemrangemot they tCestmdatag Livecan alsogbe used 
for better control. The noise model can be used to 
anticipate the effects of correlated disturbances. By using 
it to derive a regulator, minimum variance about a setpoint 
can be obtained. Special methods are required where the 
variance of the regulator's output is too high, or when the 


process is nonminimum phase. The transfer function model can 
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of course, also be used to advantage, for servo control. The 
Simplest technique is to simply invert the transfer 
function,*which is the method used in the last chapter. 
There are much more robust approaches. 

What are the shortcomings of the method? 

The major problem is the requirement for on-line, 
open-loop testing of the plant, to obtain an input-output 
series. This is objectionable in an industrial environment 
where risks to people, the surrounding community, equipment 
and profitability all must be considered. The requirement 
for open-loop testing arises from the assumption that the 
input and noise are not cross-correlated. In closed-loop, 
planter ipUteiS amslLuncuion Ofb@controlserrorn mwhichmicme ine rurn 
correlated with the disturbances. 

What can the method contribute? 

Used as intended, the method is capable of delivering a 
transfer function plus noise model which can be used both 
for a basic understanding of the plant, and for improved 
Control. 

When compared to on-line methods, the most obvious 
difference is the emphasis on diagnostic checks. It is 
certainly possible to follow on-line the autocorrelation of 
the control error (analogous to residuals in the off-line 
Aualys1o)verhnemcontrol errorseshnouldsnocte bewautocorrelated 
if they are, there is some deticiency in the regulator. The 
autocorrelation of the control errors should be distributed 


as x?(p). And the power spectrum of ‘the control errors 
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should be white. 

Philosophically, the method has at least two things to 
contribute. First, it can be used as a stepping-off place to 
the topic of adaptive control. Secondly, it can be used as 
an introduction to stochastic processes. (e.g. Does the 
noise model represent the imperfect representation of the 
plant by the transfer function? Or does it represent 
something inherent in the system?) Both of these topics, to 
Say nothing of the related area of time series analysis, can 


profitably be studied at length. 
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APPENDIX A. TIME SERIES MODELLING 


In Chapter 3, it was suggested that the model: 

Vonks) GeO (zea) o (27990 alk) CAL ee 
Can generate a series of observations which appear to follow 
some pattern. However, this remark was made in the context 
of transfer function modelling. That is, we were 
considering: 

V(k) Slo 2ed) ol cees) Vze uk) ten tk) (A.2) 
where there is an implied causal relationship between u(k) 
and y(k). In contrast, models such as Equation (A.1) can be 
used to forecast a variable (e.g. a stock price) without 
Specifying any causal relationship to another observable. 
This is referred to as time series modelling. 

The subroutine package referred to previously provides 
the necessary tools for pure time series modelling. 

The modelling procedure is Similar to that for transfer 
function modelling. First, the autocorrelation, partial 
autocorrelation and power spectrum of the series must be 
considered, in order to find the model structure. 

Given the model structure, the next step is to estimate 
the model parameters. This process can be simplified by the 
availability of reasonable: initial guesses. 

Consider the mixed ARMA(p,q) process: 

MUR jmeno Glyn k-1) +... to (Dp) n (Kop) meta) oo (Gly) aK ole) ate, 
=a) auk=q) VAN) 
Multiplying by n(k-i) and taking expectations: 
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tiy(n,api)c@(1)y(n,a;i?-1)-..96(q)y(n,a,i-q) (A.4) 
Given: 


1>0 
i<0 


vii) = O.G1)y Cite) te. . + 6 Cp) yi tap) k2qt+1 
Iemma coulxe COrm: 


G@ = — 
where: 
® Fotis’. op) 
ip fon qa. scl esp))] 
[giz] 7 SOhG Hi= 3H?) rn ae, eee PD 


So that the initial estimates for the autoregessive 
parameters are readily available from these equations, known 
as the Yule-Walker equations. 
Once the estimates are available, we can rewrite 
Equation (A.3) as: 
—$(0)n(k)-¢(1)n(k-1)-...-6(p)n(k-p) = 
aGkji-é@(Pha(K>i)ar..-8(q)atk=q) o(0)=-1 
Let > 
Wik esga U)=0 (F)aCkal p-2a9-6( Gilat kyq) 
be the moving average part of the process, so that: 
wk) = -¢(0)n(k)-9(1)n(k-1)-...-$(p) n(k-p) 
Multiplying by w(k-i) and taking expectations: 


G(w,w,i) = ZZg(1)¢(m) c(n,n,|itl-m|) 


We also have: 
w(k) = a(k)-6(1)a(k-1)-...-@(q)a(k-q) atk) = N(0,o7(a)) 
which can be rewritten as: 


w(k) 
e(k) 
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(a) = 7 (0) 
Multiplying this by w(k-i) and taking expectations: 


aww, kr) ome tk) c= Oey 
1S ORS, G51 (AN6) 


Then the initial estimates of the moving average parameters 
are available by iterating on + (e.g. Newton-Raphson), to 


solve: 


For more details, see Wilson (1969b). 

Given the model structure, plus a starting point, we 
can perform a regression to estimate the parameters. But, 
for the class of models we have chosen, the sum of squares 
is nonlinear with respect to some of the parameters. 

Let [a(k)] denote El[a(k)], the expected value of the 
model residuals. For a pure autoregressive process: 

[a(k)] = o(z-')n(k) 
so that 

OlatkyA/ ogi) = —— (nt k-th oe (zee) ain CKiely oon) 
Korie) 0 MilGkK) 1) =) 1.0) and soln (kK) 700; =O lm So: 

dLa(k)]/d¢(i) = -n(k-i) 
which is linear in @¢. 
For a pure moving average process: 


[a(k)] = @-*(z2-')n(k) 
dla (k jel Kd O'Gi hia t-6- 329) in (k=5 $6 od Ge a poln(k)n706 (4) 


which is nonlinear in 6. 
Our approach will be a variation of Gauss' method: 
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where: 

X is the parameter vector 

J is the Jacobian of the objective function 

P is the vector of residuals 

The Marquardt-Levenburg correction is to substitute: 

(IF apmt eel bet Rigeet= iy 
for J'J above. This ensures that the Hessian will remain 
positive definite during the approach to the extremum. This 
ensures invertibility. It also requires discretion in our 
choice of Ao and v. Typically, A>» should be chosen to give 
an invtdaleconditionanumbernofeabout i, for ty Gee) The 
constant v should be chosen so that the condition number 
Stays below about 100. 

Box and Jenkins further stabilize the algorithm by 

using: 

Z = @=2%4(=)D(DITD) #001) 72 DIP 
where D is the inverse of the purely diagonal matrix 
composed of the square root of the diagonal of J. 

Of course, the objective function being minimized is 
the sum of squared residuals. However, we need more than 
juSteaachoicentoretherparametergvalues toy besablesto 
calculate the residuals. 

The general ARMA(p,q) model can be written: 

adk))= n(k)e-' (dn nik=9 etka ced p) nkkap) 

ray O Gals) akon el cee Chea KC) 
To start this model, we need the p values of n(k), and gq 


values of a(k) which occurred previous to the recorded 
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series. 
One possibility is to set the unknown values equal to 
their expected values: 
Black): |@=moemeE > iCk)4 ™aeg k<0 
A second possibility which assumes less is to start the 
calculation from k = pti, and set: 
E[a(k)] = 0 k<p 
The resulting loss of information is acceptable for large 
sample sizes. However, we do have the option of calculating 
the sum of squares without making any of these assumptions. 
The general ARMA(p,q) model can be written in two 
equivalent forms: 
Ot Zeek) =" b(zPeyatk) CAR a 
o(z)n(k) = 6(z)e(k) (A.8) 
We first use: 
fhe Ck leamaiz) nk) to (ty) leCkra sis eee cog) beck sq) 
to generate values for e(k), where: 
[e(k)] = 0 t>N-p 
Then: 
Pek) Pea" oC yn kei tll. +o (py ntktpy FP 6 zyetk) e(k)=0 , ks<0 
Pemusea tovgeneraresn(k) fOr K=U;es.., © because 0 (ze 901s 
stationary, these estimates approach zero at some instant, k 
=e -Onelastlyy 
Pak) amo lz t)n(RY+e(lyatkoi +. et elq)atk=g) 
aig) = Ome Koa 
is used to generate the residuals for t = -Q,...,N. As well 


as the objective function, we must evaluate the Jacobian. 
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Given the simplicity of the models being used, it is 


relatively easy to use analytical rather than approximate 


derivatives. Recall: 


€ (i) tpn (i) con hua Mae. ase (p)n (itp) 6461 Wie Cth aneté (q)e(itq) 


de(i) = dn(i) - ¢(1)0n(it1)-...-¢(p)don(itp) 
36(3) 2363) 3604) 363) 


+0(1)de(it1)+. )te(it 
30(4) a + TERE 
de(i) = dn(i) - ¢(1)dn(it1)-...-¢(p)don(itp) 
0¢(j) 0¢(3) 0¢(3) 0¢(3) 
re(ieetiti)+...+elalae tira) ae) 
0¢(35) eG 0¢(3 


But given the N observations of n(i): 


n(i)/2663 } J=Hen (i) 40d435) p=a0 Lede) cs. ee 


de(i) = 6(1)de(it+1)+...+O(q)de(itg)te(itj) 
AGH 36(4) 36(4) 


OC amon) Oe (ict) eee tn eS rae a) 
0¢(7) 0¢(7) 0¢(7) 


which can be used to backforecast the derivatives for i = 
Nappwe:, tebyousing the expectations: 
aliln= dels) /d6Lahceedetijfdetj me 0 i>N-p 
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for iff Opee.ea0e,eoavens 
on(i)}/d6(j) = dn(i)/3¢e(5) = 0 int.= rd), gehen 
e(i) = de(1)/d¢(j) = de(1)/d6(5) = 0 i<0 


Finally, the required derivatives can be forecast: 
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dali) =, dn(i)-¢(i)en(iri)-...-9(p)an(i-p) 
Haida (Meet FON) 0a Wliscye ii tian 
ays ae aT 
given: 
6n(i)/de(j) = dn(i)/oe(j) = 0 ist, ei ten 
a(i) = da(i)/d0(j) = da(i)/d¢e(j) = 0 i<-Q 
The use of these rather exacting calculations makes it 
possible to tackle small sample problems with confidence. 
After obtaining a set of parameter estimates, several 
tests can be applied to determine their adequacy. First, the 
plot of residuals should be inspected, looking for any 
unusual features. 
Second, the autocorrelation of the residuals should 
contain no significant terms. 
We can also consider the first i autocorrelations as a 
group. If the fitted model is appropriate: 
Q = Nizrj | Vu sl eeterer aes 
SEHOUlgebemdistributed as x \aopod)s 
Lastly, the power spectrum of the residuals should be 
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If more than one model can pass all the diagnostic 
tests, then the model with least parameters should be 
accepted. 

Time series models can be useful for highly complex 
systems, which do not yield initially to a more descriptive 
analysis. In such situations, their predictive power can be 
very useful. Using this approach, a wide variety of 
phenomenae can be investigated. But the basic assumption of 
the method is that the observed behavior is the result of a 
Fandom input, to a linear filter. So.any investigation of 
causal relationships will have to rely instead on something 
like transfer function modelling, with or without a 
stochastic element. This topic is covered in detail in the 


main body of this thesis. 
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APPENDIX B. TEST DATA 


This file contains data from the stirred tank heater, with 
inputs made to the steam valve every 5 seconds, and tank 
temperature read at the same time. The DP for the water flow 
was also read. The data is recorded below as: temperature 
(DegC), steam valve position (%), DP (mm. Hg), in format 
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